Abstract

The objective of this study was to identify quantitative trait loci (QTL) affecting 
fitness of hybrids between wild soybean (Glycine soja) and cultivated soybean 
(Glycine max). Seed dormancy and seed number, both of which are important 
for fitness, were evaluated by testing artificial hybrids of G. soja x G. max in a 
multiple-site field trial. Generally, the fitness of the F1 hybrids and hybrid derivatives
from self-pollination was lower than that of G. soja due to loss of seed
dormancy, whereas the fitness of hybrid derivatives with higher proportions of 
G. soja genetic background was comparable with that of G. soja. These differ- 
ences were genetically dissected into QTL for each population. Three QTLs for 
seed dormancy and one QTL for total seed number were detected in the F2
progenies of two diverse cross combinations. At those four QTLs, the G. max 
alleles reduced seed number and severely reduced seed survival during the win- 
ter, suggesting that major genes acquired during soybean adaptation to cultiva- 
tion have a selective disadvantage in natural habitats. In progenies with a 
higher proportion of G. soja genetic background, the genetic effects of the 
G. max alleles were not expressed as phenotypes because the G. soja alleles were 
dominant over the G. max alleles. Considering the highly inbreeding nature of 
these species, most hybrid derivatives would disappear quickly in early self- 
pollinating generations in natural habitats because of the low fitness of plants 
carrying G. max alleles. 

Introduction

Many crop species have evolved through recurrent cycles of 
hybridization with their wild and/or weedy relatives followed 
by differentiation (Harlan 1992). Gene flow from crops to 
their wild relatives has been commonly observed in many 
crop species (Ellstrand et al. 1999; Ellstrand 2003). There is 
a concern that transgenes in crops will persist in the gene
pool of wild relatives and lead to negative environmental
effects because of the difficulty in controlling gene flow com- 
pletely when genetically modified (GM) crops are field 
planted. Possible concerns related to transgene introgression 
are the evolution of aggressive weeds from hybrid derivatives 
(Warwick et al. 2009), the influence on nontarget insects 
(O'Callaghan et al. 2005), and the changes in genetic diver- 
sity of wild populations (Levin et al. 1996; Lu 2008). 

The probability of transgene introgression from a crop
species into a wild species is largely dependent on the fitness
of the F1 hybrid and subsequent generations. Fitness
may be defined as the relative ability of an individual to 
survive and successfully reproduce in a given environ- 
ment, with the most fit individuals leaving the greatest 
number of offspring (Jenczewski et al. 2003). Fitness is 
not only a characteristic of the entire genome; it is also a
property of individual genes and chromosomal segments 
(Harrison 1990). The persistence of transgenes from crop 
plants within the genomes of crop wild relatives is depen- 
dent on the fitness conferred by the transgene and by 
linked genomic regions (Gressel 1999; Jenczewski et al. 
2003; Stewart et al. 2003). The fitness of plants carrying 
domestication-related genes is assumed to be lower than 
that of their wild relatives when tested in natural habitats 
(De Wet and Harlan 1975). Transgenes would be 
expected to disappear in natural populations when linked 
with domestication-related genes that lead to a selective 
disadvantage in wild habitats, such as seed dormancy and 
seed shattering (Gressel 1999; Stewart et al. 2003). 

On the other hand, chromosomal blocks can introgress 
at a higher rate than expected when they contain advanta- 
geous gene combinations with positive fitness conse- 
quences (Rieseberg et al. 1996). There are some cases in 
which hybrids between wild relatives and crop plants may 
be as fit as or even more fit than their parents; examples 
have been found in Brassica (Snow et al. 1999; Di et al. 
2009), Raphanus (Hovick et al. 2012), and Sorghum 
(Sahoo et al. 2010). Baack et al. (2008) found that some 
alleles from cultivated sunflower (Helianthus annuus L.) 
are favored in a noncrop environment and in wild genetic 
backgrounds. Depending on the effect of the inserted gene 
itself, transfer of transgenes could lead to a change in allele 
frequencies through a selective advantage conferred to the 
recipient (Hails 2000; Gepts and Papa 2003; Jenczewski 
et al. 2003; Snow et al. 2010; Hartman et al. 2012). 

Genetically modified soybean (Glycine max) is economically
important and accounted for 81% of the worldwide
planting area of soybean in 2012 (81 million ha; James 
2012). The annual wild species Glycine soja is found in 
eastern and northeastern China, Japan, Korea, and far 
eastern Russia (Carter et al. 2004). In Japan, G. soja is 
distributed widely in disturbed habitats such as river- 
banks, roadsides, and even at the edges of soybean fields 
(Kaga et al. 2005; Kuroda et al. 2005, 2006b, 2007). 
Reproductive barriers have not been observed between 
G. max and G. soja, and the crosses can produce fertile F1
hybrids (Singh and Hymowitz 1989; Carter et al. 2004). 
The risk of transgene dispersal within Glycine is assumed 
to be very low in Japan because (1) outcrossing rates 
between G. max and G. soja are generally less than 1% 
(Nakayama and Yamaguchi 2002; Kuroda et al. 2008; 
Mizuguti et al. 2009), (2) natural F1 hybrids between G.
max and G. soja are rare in Japan, and (3) plants derived 
from those hybrids survived only one to a few years in 
natural habitats (Kaga et al. 2005; Kuroda et al. 2005, 
2006b, 2007). However, the genetic and ecological mecha- 
nisms for this lack of persistence remain unclear. 

During the domestication of soybean, G. max evolved 
from G. soja to have large and nondormant seeds with a 
determinate nontwining growth habit that may affect fitness
in natural habitats. The seed dormancy of wild soybean
is caused by the physical structure of the seed coat,
which usually does not imbibe water immediately after 
immersion (Rolston 1978; Ohara and Shimamoto 1994). 
In contrast, G. max bears seeds with little to no dormancy 
because uniform and rapid germination are important for 
soybean cultivation and food processing. Oka (1983) ana- 
lyzed reproductive success in seminatural conditions by 
using hybrid derivatives between G. max and G. soja and 
found that plants with high seed dormancy and high seed 
production successfully survived. Thus, knowledge of 
genomic regions affecting fitness-related traits helps us to 
understand the reasons why hybrid derivatives between 
G. max and G. soja are rare in natural habitats. Although
quantitative trait loci (QTL) for domestication-related traits such
as seed size and growth habit have previously been
reported (e.g., Liu et al. 2007), no attempt has been made 
to identify genetic factors affecting the number of seeds 
per plant and winter seed survival in the soil. 

In this study, artificial F1 hybrids, F2 populations, and
backcross populations were developed from two combinations of
G. soja (W, wild) and non-GM G. max (D, domesticated);
these combinations had different growth habits
and represented northern and southern Japanese germ- 
plasm based on the assumption that gene flow from GM 
G. max to G. soja occurs in both northern and southern 
Japan. The degree of fitness of the hybrids and their 
derivatives was compared with their G. soja and non-GM 
G. max parents in three regions of Japan: north, central, 
and south. On the basis of the results, we discuss the like- 
lihood of persistence of transgenes from G. max in G. soja 
populations. This is the first report of the detection of 
QTLs affecting fitness-related traits such as winter seed 
survival and seed number per plant of G. soja x G. max 
hybrids in experimental fields. 

Materials and Methods 
Plant materials 

F1 hybrids between wild and cultivated soybean

F1 hybrids between G. soja and non-GM G. max were
produced for two cross combinations. One combination
(W1 x D1) was developed from a cross between the
G. soja accession "JP036034" (W1) collected in Aomori
Prefecture, northern Japan, and the non-GM G. max cultivar
"Ryuhou" (D1), which is widely grown in northern
Japan. The other combination (W2 x D2) was developed
from a cross between the G. soja accession "JP110755" (W2)
collected in Hiroshima Prefecture, southern Japan, and the
non-GM G. max cultivar "Fukuyutaka" (D2), which is widely grown
in southern Japan (Table 1). The wild soybean accessions 
used in these crosses were obtained from the Genebank of 
the National Institute of Agrobiological Sciences. Non-GM 
G. max cultivars, D1 and D2, were obtained from the
Tohoku Agricultural Research Center and Kyushu Okinawa 
Agricultural Research Center, respectively. 

F2 populations

F2 populations, which might be expected to grow in natural
habitats, were developed for testing because of the
highly inbreeding nature of soybean. Two F2 populations,
one representing the northern region (W1 x D1: 204
individuals) and the second representing the southern
region (W2 x D2: 204 individuals), were developed from
seeds obtained by self-pollination of a single F1 hybrid plant per
cross (Table 1).

Backcross populations 

To confirm the effect of G. max genes in a predominantly
G. soja background, backcross (BC) populations were
developed for both the W1 x D1 and W2 x D2 combinations
using G. soja as the recurrent parent (Table 1,
Fig. 1). Two BC1F1 populations (W1 x D1: 68 individuals;
W2 x D2: 160 individuals) were obtained by crossing
each F1 hybrid (donor plant) to the corresponding
G. soja accession (recurrent parent). The success of the
crossing was confirmed using 60 simple sequence repeat
(SSR) markers, three markers per linkage
group for all 20 linkage groups. Furthermore, two BC2F1
populations (W1 x D1: 60 individuals; W2 x D2: 40
individuals) were developed by crossing selected BC1F1
plants (one plant per population) to G. soja. The BC1F1
plants were selected, based on the genotypes of the BC1F1
populations, because they carried the major G. max QTLs for seed
dormancy and total seed number identified in the F2 populations.
To investigate the fitness of the populations after
an additional generation of self-pollination, the seeds
obtained from self-pollination of the selected BC1F1 plants
were used to develop two BC1F2 populations (W1 x D1:
150 individuals; W2 x D2: 150 individuals).
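
As a rough guide to how quickly the G. soja background is recovered in these populations, the expected genome-wide proportion of recurrent-parent alleles after n backcrosses of an F1 is 1 - (1/2)^(n+1), and selfing leaves this expectation unchanged. The Python sketch below illustrates that standard expectation only. It assumes no selection, whereas the plants used here were deliberately selected for G. max QTL alleles, and the function name is illustrative rather than taken from the study.

```python
def recurrent_parent_fraction(n_backcrosses: int) -> float:
    """Expected fraction of recurrent-parent (G. soja) alleles after
    n backcrosses of an F1, assuming no selection (an idealization)."""
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


# Selfing (F1 -> F2, BC1F1 -> BC1F2) does not change the expected fraction.
for label, n in [("F1 / F2", 0), ("BC1F1 / BC1F2", 1), ("BC2F1", 2)]:
    print(f"{label}: {recurrent_parent_fraction(n):.3f}")
```

Under these assumptions the expected G. soja fraction rises from one half in the F1 and F2 to three quarters after one backcross and seven eighths after two.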

Field locations 

Among the 204 F2 plants in the W1 x D1 population, 104 F2
plants, together with the 10 parents and several F1 hybrids
(Table 1), were grown at 1 m x 1 m spacing at the
Tohoku Agricultural Research Center (39.5°N, 140.4°E,
Akita Prefecture, northern Japan, Appendices A1 and A2),
hereafter referred to as the "north field." The other 100 F2
plants, together with the parents and F1 hybrids, were
grown at the same density at the Western Region Agricultural
Research Center (34.5°N, 133.4°E, Hiroshima Prefecture,
southern Japan, Appendices A1 and A2), hereafter
referred to as the "south field." The W2 x D2 population,
composed of 204 F2 plants, was grown in the north and



Table 1. Artificial hybrids and populations between Glycine soja and G. max evaluated in this study.

| Combination (G. soja x G. max) | Generation | Year | Field | Pedigree | Genetic linkage map (no. loci) | Total map length (cM) |
|---|---|---|---|---|---|---|
| W1¹ x D1² | F1 | 2005 | North⁵, Central⁶, South⁷ | G. soja (W) x G. max (C) | | |
| | F2 | 2005 | North, south | W x C | 212 | 2720 |
| | BC1F1 | 2006 | Central | W / (W x C) | 214 | 2609 |
| | BC2F1 | 2007 | Central | W / W / (W x C) | 103 | 994 |
| | BC1F2 | 2007 | Central | W / (W x C) | 105 | 931 |
| W2³ x D2⁴ | F1 | 2005 | North, central, south | W x C | | |
| | F2 | 2005 | North, south | W x C | 208 | 2547 |
| | BC1F1 | 2006 | Central | W / (W x C) | 199 | 2514 |
| | BC2F1 | 2007 | Central | W / W / (W x C) | 72 | 572 |
| | BC1F2 | 2007 | Central | W / (W x C) | 72 | 599 |

1 W1 (JP036034): G. soja collected from Aomori Prefecture in northern Japan.
2 D1 (Ryuhou): G. max cultivar commonly planted in northern Japan.
3 W2 (JP110755): G. soja collected from Hiroshima Prefecture in southern Japan.
4 D2 (Fukuyutaka): G. max cultivar commonly planted in southern Japan.
5 North field: Akita Prefecture, northern Japan.
6 Central field: Ibaraki Prefecture, central Japan.
7 South field: Hiroshima Prefecture, southern Japan.

November, mature pods with seeds were harvested by hand
twice a week. Standard agricultural practices were followed,
including application of fertilizer (650 kg/ha of 3 parts nitrogen, 10
parts phosphate, and 10 parts potassium; 1000 kg/ha of
fused magnesium phosphate; 1000 kg/ha of limestone),
weeding, and application of insecticides to control stink bug and common
cutworm.

Trait measurement 

Of a total of 11 fitness-related traits (Table 2), 10 were
treated as quantitative traits and one (seed coat color)
was treated as a qualitative trait. Two seed dormancy-related
traits, namely seed winter survival (DORM_1) and
seed hardness (DORM_2), were evaluated using the seeds
from individual plants. Because all hard seeds from randomly
selected lines germinated after mechanical abrasion
on wetted filter paper or soil at
room temperature, hard seeds were treated as viable
nongerminated seeds. Seed production-related traits,
namely total seed number (PROD_1), total seed
weight (PROD_2), 100-seed weight (PROD_3), total pod
number (PROD_4), stem dry weight (PROD_5), and
stem length (PROD_6), were evaluated for each plant
(Table 2). These traits were recorded on a per-plant basis
after the seeds of each plant had matured. Because
the two southern accessions (W2 and D2), their F1
plants, and most of the W2 x D2 F2 plants flowered
late in the north field, whole plants (7 of
10 W2, all 5 D2, all 5 W2 x D2 F1, and 73 of
104 W2 x D2 F2 plants) were taken from the field before
the first snowfall and dried in the greenhouse to obtain
mature seeds. The methods of trait evaluation in the



Table 2. Fitness-related traits evaluated in this study.

| General attribute | Trait (unit) | Abbreviation | Evaluation method |
|---|---|---|---|
| Seed dormancy | Seed winter survival (%) | DORM_1 | Percentage of germinated and viable nongerminated seeds after burial of mesh bags containing 20 unscarified seeds 3 cm below the soil surface of each experimental field (north, central, or south) from late December to the following spring (April-May) |
| | Seed hardness (%) | DORM_2 | Percentage of nonimbibed seeds after soaking for 4 days in an incubator at 4°C |
| | Seed coat color | DORM_3 | Black or other (buff, green, or brown) |
| Seed production | Total seed number | PROD_1 | Total number of harvested seeds |
| | Total seed weight (g) | PROD_2 | Total weight of harvested seeds |
| | 100-seed weight (g) | PROD_3 | 100-seed weight |
| | Total pod number | PROD_4 | Total number of harvested pods |
| | Stem dry weight (g) | PROD_5 | Stem weight after drying for 7 days at 70°C |
| | Stem length (cm) | PROD_6 | Length from ground to top of stem |
| Seed dormancy & production | Total number of seeds expected to germinate in the following year | SURV | Total number of seeds (PROD_1) multiplied by the winter seed survival rate (DORM_1) |
| Flowering phenology | Days to first flower (day) | FLOW | Number of days from sowing to flowering of first flower |

south fields in the same manner as the W1 x D1 population
(Table 1). Because maintenance and evaluation of many
climbing plants with a higher G. soja background are difficult,
backcross populations (BC1F1, BC2F1, and BC1F2)
were grown at 1 m x 1 m spacing only at the National
Institute of Agrobiological Sciences (36.0°N, 140.1°E, Ibaraki
Prefecture, central Japan, Appendices A1 and A2),
hereafter referred to as the "central field." Seeds were
scarified with a razor blade and germinated in small pots
at the beginning of July in 2005, in the middle of June in
2006, and at the end of May in 2007. The seedlings
were transplanted to the field in the middle of July every
year. Three stakes per plant, with a net strung between them,
were used to guide twining stems. During October to




Figure 1. Morphological differences between wild soybean (Glycine soja,
W2) and cultivated soybean (G. max, D2): twining growth habit
and small blackish seeds. Photo: Akito Kaga.

backcross populations were the same as for the F2
populations. The number of days from sowing to first
flowering was recorded as FLOW. The total number of
seeds expected to germinate in the following year (SURV)
was estimated as PROD_1 multiplied by DORM_1. The
mean and standard deviation for each trait and the correlation
coefficient between each pair of traits were calculated.
Differences in mean values between G. soja, G. max,
and F1 hybrids were analyzed separately for each field
location in each year with the Mann-Whitney U-test or
Kruskal-Wallis test. The median and range instead of 
mean and standard deviation are reported for segregating 
populations. All statistical analyses were conducted using 
R version 2.9.2 (R Development Core Team 2009). 
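
The trait summaries described above can be sketched in a few lines of Python. This is an illustration only: the study itself used R, the plant values below are hypothetical, and the function names are not from the paper. The sketch computes SURV as total seed number multiplied by the winter survival rate, reports the median and range for a segregating population, and computes the Mann-Whitney U statistic by counting pairwise wins (no P value is derived here).

```python
from statistics import median


def surv(prod_1: int, dorm_1_percent: float) -> float:
    """SURV = PROD_1 x DORM_1, with DORM_1 supplied as a percentage."""
    if not 0.0 <= dorm_1_percent <= 100.0:
        raise ValueError("DORM_1 must be a percentage between 0 and 100")
    return prod_1 * dorm_1_percent / 100.0


def median_and_range(values):
    """Median, minimum, and maximum, as reported for segregating populations."""
    return median(values), min(values), max(values)


def mann_whitney_u(xs, ys):
    """U statistic for xs versus ys: pairs where x > y count 1, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in xs for y in ys)


# Hypothetical plants: (total seed number, winter seed survival in %).
plants = [(400, 90.0), (700, 10.0), (150, 0.0)]
surv_values = [surv(seeds, survival) for seeds, survival in plants]
print(median_and_range(surv_values))
```

A plant with many seeds but low winter survival can therefore contribute fewer expected seedlings than a plant with fewer, more dormant seeds, which is why SURV combines the two traits.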

Genotyping 

Total DNA of each putative F1 seed was extracted from a
small piece of cotyledon tissue using an EZ1 DNA Tissue
kit (Qiagen, Tokyo, Japan). Total DNA of F2, BC1F1, BC2F1,
and BC1F2 individuals was extracted from 100 mg of fresh
leaf tissue. DNA concentration was adjusted to between 5 and
25 ng/µL by comparison with known concentrations of
standard λ DNA on a 1.5% agarose gel. A total of 720 SSR
markers from SoyBase (http://soybase.org/) were screened
to detect polymorphisms between the parents. Five markers
were also included to track the three classical soybean loci,
I, T, and Dt1 (Appendix A3). Three markers, dCHS1 (Matsumura
et al. 2005), AY262686B, and AY262686Z, were
used to track the I locus, which controls seed coat color
and might be related to seed winter survival. A single-base
indel marker, sF3'H1, reported by Toda et al. (2002) was
used to detect the T locus, which controls pubescence
color and interacts with the I locus. An SSR marker, LFsoy3,
was designed to track the Dt1 locus, which might be related
to stem length and total seed number. These markers were
amplified by using KOD-plus polymerase (Toyobo, Osaka,
Japan) according to the manufacturer's guide in a GeneAmp
9700 PCR system (Applied Biosystems, Tokyo, Japan).
Polymorphisms were scored from banding patterns in
a 12% polyacrylamide gel.

Successful crossing was confirmed by analysis of DNA
from putative F1 seeds, based on the genotype of the polymorphic
SSR marker Satt207, which has a different allele
in each of the four parents (W1, 177 bp; D1, 234 bp; W2,
210 bp; and D2, 231 bp). To genotype F2, BC1F1, BC2F1,
and BC1F2 individuals, polymorphic markers were selected
at about 20-cM intervals based on the composite map of
soybean from SoyBase (http://soybase.org/). Using four
types of fluorescent labels (6-FAM, VIC, NED, or PET),
multiplex PCR was performed to detect segregation patterns
within each population. The PCR reaction mixture
had a total volume of 5 µL, containing 1.7 µL of
template DNA, 2.5 µL of 2 x Qiagen Multiplex PCR Master
Mix, 0.5 µL of a four-primer mix (1.25 µmol/L each),
and 0.3 µL of water. PCR amplification was performed in a
GeneAmp 9700 (Applied Biosystems) or iCycler (BioRad,
Tokyo, Japan) thermal cycler programmed with an initial
activation step at 95°C for 15 min; followed by 40 cycles
of 30 sec at 94°C for denaturation, 90 sec at 57°C for
annealing, and 60 sec at 72°C for extension; followed by
30 min at 60°C for final extension. For analysis, 3 µL of
PCR product was denatured at 95°C for 5 min after mixing
with 10 µL of Hi-Di formamide (Applied Biosystems)
and 15 nL of GeneScan-500LIZ size standard (Applied
Biosystems). Denatured samples were analyzed by using a
3100 Genetic Analyser (Applied Biosystems) and the output
was analyzed using GeneMapper 3.0 software
(Applied Biosystems).

Linkage map construction 

Linkage maps were constructed for the F2, BC1F1, BC2F1, and
BC1F2 populations by using JoinMap ver. 3.0 software
(Van Ooijen and Voorrips 2001) according to the method 
of Han et al. (2005). The recombination frequencies were 
converted into map distances using the Kosambi mapping 
function (Kosambi 1944). 
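
For reference, the Kosambi function converts a recombination fraction r into a map distance of 25 ln[(1 + 2r) / (1 - 2r)] cM, and the inverse is r = 0.5 tanh(d / 50) for d in cM. The short Python sketch below implements both directions; the function names are illustrative and are not part of JoinMap.

```python
import math


def kosambi_cm(r: float) -> float:
    """Map distance in centimorgans for recombination fraction r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_recombination(distance_cm: float) -> float:
    """Inverse mapping: recombination fraction for a distance in centimorgans."""
    if distance_cm < 0.0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * math.tanh(distance_cm / 50.0)


for r in (0.05, 0.10, 0.20):
    print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
```

At small recombination fractions the distance in cM is close to 100r, and it grows faster than that as r approaches 0.5 because the function allows for multiple crossovers with interference.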

QTL analysis 

The QTL analysis for phenotypic data from the BC1F1,
BC2F1, and BC1F2 individuals was conducted with MultiQTL
ver. 2.6 software according to Peng et al. (2003). For
phenotypic data of the F2 individuals from the two field
environments (north and south), a single-QTL, multiple-environment
model was fitted to scan the entire genome
(Korol et al. 1998, 2001). Statistical significance
thresholds (α = 0.05) for putative QTLs were determined by
10,000 runs of a permutation test (Churchill and Doerge
1994). Multiple interval mapping (Kao et al. 1999) was
then conducted to reduce the background variation by taking
into account QTL effects from other chromosomes.
After the permutation test runs, the parameters of significant
QTLs (statistical threshold α = 0.05) were reported as
position, additive and dominance effects, and percentage of
variance explained (PVE).
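
The idea behind a permutation-based genome-wide threshold can be sketched as follows. This is a deliberately simplified illustration, not the MultiQTL procedure: it assumes a single-marker scan that uses the squared correlation between genotype score and phenotype as the test statistic (rather than interval mapping), the data are simulated, and all function names are hypothetical. Phenotypes are shuffled among plants, the largest statistic across all markers is recorded for each shuffle, and the threshold is taken near the (1 - α) quantile of those maxima.

```python
import random


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no association can be measured
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy * sxy / (sxx * syy)


def max_scan_statistic(markers, phenotypes):
    """Largest marker-trait association across the whole scan."""
    return max(r_squared(marker, phenotypes) for marker in markers)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Approximate (1 - alpha) quantile of the genome-wide maximum statistic
    obtained after shuffling phenotypes among plants."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    null_maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null_maxima.append(max_scan_statistic(markers, shuffled))
    null_maxima.sort()
    index = min(n_perm - 1, int((1.0 - alpha) * n_perm))
    return null_maxima[index]


# Simulated data: 5 markers scored 0/1/2 on 60 plants; marker 2 affects the trait.
rng = random.Random(1)
markers = [[rng.choice([0, 1, 2]) for _ in range(60)] for _ in range(5)]
phenotypes = [10.0 * g + rng.gauss(0.0, 1.0) for g in markers[2]]
threshold = permutation_threshold(markers, phenotypes, n_perm=200, seed=2)
print(max_scan_statistic(markers, phenotypes) > threshold)
```

Taking the maximum over all markers in each shuffle is what makes the threshold genome-wide: it controls the chance of any false positive across the scan rather than at a single marker.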

Results 

Fitness of hybrids and their derivatives 

Cultivated and wild soybean 

Domestication-related traits generally
differed between the G. soja and G. max parents for both
combinations (W1 x D1 and W2 x D2) tested at all
three field locations (north, central, and south) in 2005
(Table 3). The means of G. max were higher than those
of G. soja for PROD_3, whereas the means of G. soja
were generally higher than those of G. max for DORM_1,
DORM_2, PROD_1, PROD_4, PROD_6, SURV, and
FLOW. In contrast, PROD_2 and PROD_5 were not
notably different between G. soja and G. max. In particular,
the means of G. max for PROD_2 tended to be similar to
or higher than those of G. soja in the regions where each
cultivar is recommended for growing. Although no G. max data were
obtained from the south field in 2005, we confirmed
these trends in 2004 (A. Kaga & Y. Kuroda, unpublished
data).



F1 hybrids and F2 populations

The phenotypic values of the F1 and F2 generations in most
field locations were intermediate between G. soja and
G. max for DORM_1, DORM_2, PROD_1, PROD_3,
PROD_4, PROD_6, SURV, and FLOW (Table 3, Fig. 2A
and B). However, the means of PROD_2 and PROD_5 in
the F1 and F2 generations tended to be similar to or higher
than those of G. soja in the regions recommended for growing
the G. max parent. Most of the G. soja seeds dug up
from the soil in the spring had not imbibed water, whereas
the G. max seeds were rotten. Seeds from F1 and F2 plants
were of all types: hard seeds that did not absorb water,



Figure 2. Distribution of SURV, calculated by multiplying DORM_1 and PROD_1, for each combination (W1D1 or W2D2) and generation (F2,
BC1F1, BC2F1, or BC1F2). Numbers in brackets indicate SURV values; areas between dotted lines indicate ranges of SURV values. Panels include (A) W1D1 at the north field in 2005 and (B) W2D2 at the south field in 2005.

water-absorbing viable seeds, and rotten seeds. DORM_2
was positively correlated with DORM_1 (P < 0.05) in the
F2 generations of W1 x D1 (seeds harvested from the
north field, R² = 0.81; seeds harvested from the south field,
R² = 0.69, Appendix A4) and W2 x D2 (seeds harvested
from the north field, R² = 0.85; seeds harvested from the
south field, R² = 0.60).

The extent of DORM_1 was associated with the maternally
inherited seed coat color and the pubescence color of
the F3 seeds produced on F2 plants (Table 4). G. soja
has black seeds and brown pubescence, and G. max has
yellow seeds and white pubescence. High DORM_1 was
observed for seeds with black or brown seed coat color
produced by F2 plants with brown pubescence color,
and most of those seeds had not imbibed water when
tested in the spring (brown seeds, 75.9%; black seeds,
75.5%). The seeds with other colors of pubescence had
relatively low DORM_1. In particular, the seeds with
brown seed coat color produced by F2 plants with white
pubescence color (22 of 27 F2 plants) were severely
cracked or split and could not be found in the following
spring (DORM_1, 0.2%).

The PROD_1 of the F1 plants was generally intermediate
between G. soja and G. max for both the
W1 x D1 and W2 x D2 combinations (Table 3,
Fig. 2A and B). An exception was found in the north
field, where PROD_1 of the F1 plants from the
W1 x D1 combination (average 688) was similar to or
higher than that of the G. soja parent (average 421).
The mean values of PROD_2 and PROD_5 in the F1
generation were also higher than those of the parents.
In the next generation, PROD_1 of several F2 individuals
was similar to or higher than that of the G. soja
parent. These transgressive PROD_1 values may be
explained by heterosis or by positional effects within a field
on plant size-related traits, because PROD_1 showed significant
(P < 0.05) positive correlations with
plant size-related traits such as PROD_5 and PROD_6
(Appendix A4).

The values for SURV of G. soja and G. max were
different within both the W1 x D1 combination and the
W2 x D2 combination because G. soja had both high
PROD_1 and high DORM_1, whereas G. max had low
PROD_1 and zero DORM_1 (Table 3, Fig. 2A and B).
Average SURV of F1 plants was intermediate between G.
soja and G. max for each combination. Greater variation
was observed in the F2 progenies than in the F1
plants because of genetic segregation of PROD_1 and
DORM_1.

Backcross populations 

For the backcross populations (BCs; BC1F1, BC2F1, and
BC1F2) from both combinations, plants were grown only
in the central field in 2006 and 2007. The phenotypic differentiation
between G. soja and G. max in the central
field was similar to that seen in the other fields in 2005.
All trait values of the BCs were clearly shifted toward
those of the G. soja recurrent parents. For both combinations,
the medians of the BC1F1 and BC2F1 populations
were very close to the means of G. soja for all traits
(Table 3). In contrast, the shift in the BC1F2 populations
for DORM_1, DORM_2, PROD_1, and SURV
was not as obvious as in the BC1F1 and BC2F1 populations.



Table 4. Relationships between percentage of seed winter survival (DORM_1) and colors of seed coat and pubescence in F2 populations. Column headings give seed coat color (pubescence color) and the presumed genotype in brackets.

| Combination | Seed production location | Measure | Black (brown) [i/i, T/-, R/-]¹ | Brown (white) [i/i, t/t, r/r] | Brown (brown) [i/i, T/-, R/-] | Yellow (unclassified) [I/-]² | Green (unclassified) [I/-]² | Unclassified |
|---|---|---|---|---|---|---|---|---|
| W1 x D1 | North | DORM_1 (%) | 92.7 ± 11.8 a | 0.0 ± 0.0 b | 81.0 ± 12.4 ab | 33.1 ± 25.5 b | 51.8 ± 29.2 ab | 43.0 ± 35.6 b |
| | | n | 15 | 4 (4)³ | 6 | 19 (1)³ | 54 | 5 |
| | South | DORM_1 (%) | 83.1 ± 17.4 a | 0.0 ± 0.0 b | 95.8 ± 4.9 a | 12.9 ± 14.2 b | 32.0 ± 28.4 b | |
| | | n | 17 | 8 (4)³ | 6 | 13 (1)³ | 52 (5)³ | 0 |
| W2 x D2 | North | DORM_1 (%) | 51.3 ± 28. T | 0.0 ± 0.0 c | 57.5 ± 32.3 ab | 14.0 ± 19.2 abc | 15.9 ± 19.3 bc | 37.5 ± 23.2 abc |
| | | n | 16 | 6 (6)³ | 4 | 11 (3)³ | 55 (1)³ | 10 |
| | South | DORM_1 (%) | 74.2 ± 28.4 a | 0.6 ± 1.7 b | 64.2 ± 44.4 a | 4.2 ± 8.4 b | 4.2 ± 10.6 b | |
| | | n | 13 | 9 (6)³ | 6 | 24 | 48 (4)³ | 0 |
| Overall | | DORM_1 (%) | 75.5 ± 26.9 a | 0.2 ± 1.0 c | 75.9 ± 30.4 a | 15.6 ± 20.8 bc | 26.5 ± 29.6 b | 42.1 ± 27.7 ab |
| | | n | 61 | 27 (22)³ | 22 | 68 (3)³ | 209 (10)³ | 15 |

Different letters among genotypes at each field location indicate significant differences at the 5% level by the Kruskal-Wallis test.
1 Presumed genotypes at the I, T, and R loci.
2 Genotype at the I locus would be ///' in the W2 x D2 population.
3 Number of cracked or split seeds.

Plants of the backcross generations were vigorous
in both 2006 and 2007, when mulch sheets were used on
the soil surface; in contrast, the F1 and F2 generations,
which were grown without the sheets in 2005, were less
vigorous.

Because G. soja had higher PROD_1 and DORM_1
than G. max, SURV of G. soja was higher than that of G.
max in both the W1 x D1 and W2 x D2 combinations
in all three backcross generations (Table 3, Fig. 2C-H).
The medians of SURV in the BC1F1 and BC2F1 generations
were very close to that of G. soja; still, there was variation
in both DORM_1 and PROD_1 in the BC1F1 and BC2F1
generations (Fig. 2C-H). Some individuals had the potential
to yield large numbers of dormant seeds because their
seed number (PROD_1) was greater than that of G. soja and
their seed dormancy (DORM_1) was similar.

QTL analysis for F2 populations

Of 720 markers screened, 359 and 378 markers revealed
clear polymorphisms between G. soja and G. max in the
W1 x D1 and W2 x D2 populations, respectively. Of
these, 212 and 208 markers were used to develop F2 linkage
maps of the W1 x D1 and W2 x D2 populations,
respectively (Table 1, Fig. 3). Although gaps of more than
30 cM were observed between Satt285 and Satt414 on
LG-J across populations and generations, the SSR markers
were otherwise distributed evenly across the soybean
genome, and marker orders were conserved between the
W1 x D1 and W2 x D2 population maps as well as
between those maps and the composite map of Song
et al. (2004). The total lengths of the linkage maps developed
here were about 2500 cM for the F2 and BC1F1 populations,
comparable to the lengths of the SSR-based
linkage maps developed by Song et al. (2004) (2524 cM)
and Liu et al. (2007) (2383 cM).

Several markers (1.4% and 3.4% of the markers in
the W1 x D1 and W2 x D2 populations, respectively)
showed segregation ratios that deviated significantly (P < 0.05)
from the expected 1:2:1 ratio of G. soja homozygote,
heterozygote, and G. max homozygote. Although
most markers with segregation distortion were scattered
over several linkage groups and were not consistent
between the W1 x D1 and W2 x D2 populations, five
of the distorted markers were adjacent and located in 
the upper half of LG-C1 in the W2 x D2 population 
(Fig. 3A). Paracentric inversions and reciprocal translo- 
cations, which can lead to pollen and ovule sterilities 
and have been found between a specific Chinese acces- 
sion of G. soja and G. max (Singh and Hymowitz 
1988; Palmer et al. 2000), might account for the segre- 
gation distortions in these Japanese germplasm sources 
as well. 
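
The text does not name the test used to detect segregation distortion. One common choice, shown here purely as an assumption, is a chi-square goodness-of-fit test of the observed genotype counts against the 1:2:1 expectation for an F2. With three genotype classes there are 2 degrees of freedom, for which the P value has the closed form exp(-χ²/2). The counts in the example are hypothetical.

```python
import math


def segregation_chi_square(n_soja_hom: int, n_het: int, n_max_hom: int):
    """Chi-square test of F2 genotype counts against a 1:2:1 ratio.

    Returns (chi-square statistic, P value). With 2 degrees of freedom the
    chi-square survival function is exactly exp(-x / 2).
    """
    total = n_soja_hom + n_het + n_max_hom
    if total == 0:
        raise ValueError("at least one genotyped plant is required")
    observed = (n_soja_hom, n_het, n_max_hom)
    expected = (0.25 * total, 0.5 * total, 0.25 * total)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, math.exp(-chi2 / 2.0)


# Hypothetical marker scored on 204 F2 plants with an excess of G. soja homozygotes.
chi2, p_value = segregation_chi_square(70, 100, 34)
print(f"chi-square = {chi2:.2f}, P = {p_value:.4f}")
```

A marker would be flagged as distorted when the P value falls below 0.05, which corresponds to a chi-square statistic above about 5.99 at 2 degrees of freedom.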

In total, 28 and 27 QTLs related to seed dormancy,
seed production, and flowering phenology were detected
in the F2 generation of the W1 x D1 and W2 x D2 populations,
respectively (Fig. 3, Appendix A5). Among them,
QTLs in three regions (LG-A2, -C2, and -D1b) had large
effects on seed dormancy and a QTL in one region (LG-L)
had a significant effect on seed production.

Seed dormancy 

Eight and six QTLs associated with seed dormancy were
detected in the W1 x D1 and W2 x D2 populations,
respectively (Fig. 3, Appendix A5). The G. max alleles at
all of those QTLs had additive effects (Add) of decreasing
DORM_1 (Add, -3 to -37%; PVE, 6.4-76.2%) and
DORM_2 (Add, -1 to -26%; PVE, 6.2-42.7%). Three
major QTLs, which were located on LG-A2, -C2, and
-D1b, were associated with DORM_1 and DORM_2 in
both populations (Fig. 3A). The G. max allele at the QTL
on LG-A2 had larger additive effects in the W2 x D2
population than in the W1 x D1 population. The QTL
on LG-A2 was located near the I locus, and the QTL on
LG-C2 was close to the T locus. The additive effect of the
QTL on LG-A2 detected in seeds harvested from the
south field tended to be higher than that for seeds
harvested from the north field. In contrast, the additive
effects of the QTLs on LG-C2 and LG-D1b detected in seeds
harvested from the south field tended to be lower than those
for seeds harvested from the north field.
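
To make the quantities Add and PVE concrete, the following is a minimal single-marker sketch. It is a simplification and should not be read as the QTL mapping method used in this study: it only compares genotype-class means at one marker, and the function name, genotype codes (SS, SM, MM), and data are hypothetical. It assumes all three genotype classes are present and the phenotypes are not all identical.

```python
from statistics import mean


def additive_effect_and_pve(phenotypes, genotypes):
    """Single-marker estimates of additive effect, dominance, and PVE.

    genotypes uses 'SS' (G. soja homozygote), 'SM' (heterozygote), and
    'MM' (G. max homozygote). A negative additive effect means that the
    G. max allele decreases the trait value.
    """
    groups = {g: [p for p, gt in zip(phenotypes, genotypes) if gt == g]
              for g in ("SS", "SM", "MM")}
    means = {g: mean(values) for g, values in groups.items()}
    additive = (means["MM"] - means["SS"]) / 2
    dominance = means["SM"] - (means["MM"] + means["SS"]) / 2
    grand_mean = mean(phenotypes)
    ss_total = sum((p - grand_mean) ** 2 for p in phenotypes)
    ss_marker = sum(len(values) * (means[g] - grand_mean) ** 2
                    for g, values in groups.items())
    # PVE: percentage of phenotypic variance explained by the marker.
    pve = 100 * ss_marker / ss_total
    return additive, dominance, pve


# Hypothetical winter survival (%) of six F2-derived lines.
survival = [90, 80, 70, 60, 30, 20]
marker = ["SS", "SS", "SM", "SM", "MM", "MM"]
add, dom, pve = additive_effect_and_pve(survival, marker)
print(f"Add = {add:.1f}%, dominance = {dom:.1f}%, PVE = {pve:.1f}%")
```

In this toy data set the G. max allele lowers survival, so the additive effect is negative, in line with the sign convention used for DORM_1 and DORM_2 above.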

Seed production 

Sixteen and 17 QTLs related to seed production were
detected in the W1 x D1 and W2 x D2 populations,
respectively (Fig. 3, Appendix A5). Most G. max alleles at those
QTLs had additive effects of decreasing PROD_1 (Add,
-73 to -496 seeds; PVE, 8.8-33.8%), PROD_2 (Add, -5.1
to -42.5 g; PVE, 6.8-33.5%), PROD_4 (Add, -74 to -229
pods; PVE, 7.5-33%), PROD_5 (Add, -6.7 to -18.6 g;
PVE, 10.2-28.8%), and PROD_6 (Add, -4 to -50 cm;
PVE, 11.5-40.2%) and increasing PROD_3 (Add, +0.6 to
+1.1 g; PVE, 7.8-28.2%).

The G. max alleles at several QTLs had effects opposite to
those expected based on the parental phenotypes (Appendix
A5). In the W1 x D1 population, these effects were toward
decreased PROD_3 for the QTLs on LG-D1b and LG-M (Add, -0.2 to
-0.3 g; PVE, 6.4-8.1%, south field), increased PROD_4 for
the QTL on LG-M (Add, +158; PVE, 14.8%), and increased
PROD_5 for the QTL on LG-M (Add, +6.6 g; PVE,
15.8%). In the W2 x D2 population, G. max alleles had
effects toward decreased PROD_3 for the QTL on LG-C2
(Add, -1.3 g; PVE, 43.5%), increased PROD_5 for the QTLs
on LG-D2 and LG-M (Add, +9.5 g; PVE, 22.6%), and
increased PROD_6 at the QTL on LG-D2 (Add, +19.0 to
+20.1 cm; PVE, 7.7-13.4%).

QTLs with large effects on seed production-related traits
such as PROD_1, PROD_2, PROD_3, PROD_4, PROD_5,
and PROD_6 were located near the marker LFsoy3 in both
the W1 x D1 and W2 x D2 populations (Fig. 3B). The
G. max alleles at those QTLs, except for PROD_3, had
additive effects of decreasing the phenotypic values for
those traits, but the magnitude of the effect differed
depending on the test location (Appendix A5). For both
populations, the additive effects for PROD_1, PROD_2, and
PROD_5 were greater in the south field than in the north field.
Although the effects of QTLs for PROD_4 would be
expected to be consistent, they did not seem to be related
in the populations. The frequency of pods with only one
or two seeds on plants in the north field was greater than
for plants in the south field (data not shown), which may
explain this inconsistency.

Flowering phenology 

Four QTLs for flowering phenology were detected in each
population (Fig. 3, Appendix A5). In general, G. max
alleles at these QTLs had additive effects of hastening
FLOW (Add, 0 to -3.5 days; PVE, 5.2-54.4%). In contrast,
the G. max allele of the QTL on LG-D1b in the
W1 x D1 population delayed FLOW (Fig. 3A). The location
of the QTL identified on LG-L in the W1 x D1
population (Add, -1.8 to -2.4 days; PVE, 14.1-21.3%)
was very near that of a QTL identified in the W2 x D2
population (Add, -0.8 day; PVE, 5.5%; Fig. 3B). The
locations of QTLs with large effects on FLOW differed
between the two populations: in the W1 x D1
population, a major QTL was found on LG-O (Add, -2.8
to -3.5 days; PVE, 29.6-54.4%), whereas in the W2 x D2
population, a major QTL was found on LG-H (Add, -1.8
to -3.2 days; PVE, 26.6-36.8%).

QTL analyses for BC1F1 populations

Linkage maps for the W1 x D1 and W2 x D2 BC1F1
populations were constructed using 214 and 199 markers,
respectively (Table 1, Fig. 3). In total, 8 and 20 QTLs
were detected in the W1 x D1 and W2 x D2 populations,
respectively (Fig. 3, Appendix A5).

Seed dormancy 

Three QTLs for seed dormancy were detected in each
population (Fig. 3A, Appendix A5). Although the QTLs
for DORM_2 on LG-A2 were detected only in the
W2 x D2 population, the QTLs for DORM_1 on LG-A2
were detected across combinations (W1 x D1 and
W2 x D2), suggesting that the G. max alleles have a
consistent genetic effect even within a high percentage of wild
genetic background. The G. max allele at this QTL on
LG-A2 had a large negative effect on DORM_1 (Add,
-7% to -8%; PVE, 21.2-22.3%).

Seed production 

Three and 15 QTLs related to seed production were
detected in the W1 x D1 and W2 x D2 BC1F1 populations,
respectively (Fig. 3, Appendix A5). No QTL was
common between the F2 and BC1F1 generations. For traits
PROD_1, PROD_2, PROD_4, and PROD_5, no QTLs
were detected in the W1 x D1 population, but several
QTLs different from those found in the F2 generation
were identified in the W2 x D2 population. As in the F2
generation, most G. max alleles had the effect of decreasing
seed production-related traits, that is, at QTLs for
PROD_5 (Add, -8.6 to -11.6 g; PVE, 5.1-9.4%) and
PROD_6 (Add, -21 to -62 cm; PVE, 7.5-20.2%). However,
some G. max alleles at those QTLs from the
W2 x D2 population had the effect of increasing
PROD_1 on LG-A2 (Add, +354 seeds; PVE, 5.5%),
PROD_2 on LG-A2 (Add, +17.9 g; PVE, 7.4%), PROD_4
on LG-E (Add, +113 pods; PVE, 5.1%) and on LG-M
(Add, +112 pods; PVE, 4.9%), and PROD_6 on LG-A2
and LG-O (Add, +20 to +21 cm; PVE, 6.9-7.9%). The
G. max alleles at all QTLs detected had the effect of
increasing PROD_3.

Flowering phenology 

Two QTLs for FLOW were detected in each population
(Fig. 3B, Appendix A5). Although the QTLs in the two
populations were different, the G. max allele at each one
had the effect of delayed flowering time (FLOW, -1.8 to
-5.0 days; PVE, 11.2-42.5%). There were QTLs in common
between the F2 and BC1F1 on LG-O in the W1 x D1
population (FLOW, -2.8 to -5.0 days; PVE, 29.6-54.4%)
and on LG-H in the W2 x D2 population (FLOW, -1.8
to -3.2 days; PVE, 15.3-36.8%).

QTL analyses for BC2F1 and BC1F2
populations

The linkage maps for the W1 x D1 and W2 x D2 BC2F1
populations were constructed by using 103 and 72 markers,
respectively (Table 1, Fig. 3). These markers were located in
the heterozygous regions in the selected BC1F1 plants. In
addition, BC1F2 populations were developed by using seeds
from self-pollination of the two selected BC1F1 plants
(W1 x D1 and W2 x D2) and partial linkage maps were
constructed. The linkage maps for the W1 x D1 and
W2 x D2 BC1F2 populations were constructed by using
105 and 72 markers, respectively. The order of markers in
each linkage map was well conserved between the W1 x D1
and W2 x D2 populations as well as among the F2, BC1F1,
and BC2F1 populations (Fig. 3A and B). Entire linkage
groups (LG-A1, -C1, -I, and -M in W1 x D1 and LG-C1,
-B2, -D2, -G, -J, and -O in W2 x D2) were found to have
been replaced with G. soja genome in the two selected
BC1F1 plants, the BC2F1 population, and the BC1F2 population.

In the BC2F1 generation, which had a higher percentage
of G. soja genetic background than the BC1F1 but
included the selected fitness-related alleles from G. max,
10 QTLs were detected in both the W1 x D1 and
W2 x D2 populations (Fig. 3, Appendix A5). Similar to
the BC1F1 generation, most QTLs in the BC2F1 generation
were different from those detected in the F2 generation.
Unlike the situation in the BC1F1 generation, the effects
of DORM_1 and DORM_2 QTLs on LG-A2 were not
detected in the W1 x D1 combination (Fig. 3A).

In the BC1F2 generation, which had a similar percentage
of G. soja background to the BC1F1 generation but
was homozygous for selected fitness-related alleles from
G. max, 19 and 17 QTLs were detected in the W1 x D1
and W2 x D2 populations, respectively (Fig. 3, Appendix
A5). The major QTLs for seed dormancy on LG-A2, -C2,
and -D1b (Fig. 3A) and for seed number on LG-L
(Fig. 3B) were well conserved between the F2 and BC1F2
generations, except for the DORM_1 QTL on LG-A2 in
the W1 x D1 population, which was present in the F2
but not detected in the BC1F2 generation.

Discussion 

Life history in relation to hybrid derivatives 

In a previous study, hybrid derivatives that had arisen from
gene flow between G. soja and G. max were grown in
several natural habitats in Japan (Kuroda et al. 2010).
Because the hardness of the seed coat, a phenotype related
to seed dormancy (Table 4), is largely determined by the
phenotype of the maternal G. soja plant, F1 seeds produced
by pollen from G. max can survive in the soil for several years,
and the F1 plants can grow in the wild with G. soja. Here,
FLOW of F1 hybrids tended to be similar to that of the G. soja
parent or intermediate between the G. soja and G. max parents
(Table 3), indicating that the flowering of natural F1
hybrids and local G. soja could overlap in several parts of
Japan where natural hybrids have been identified. Due to
genetic segregation in the F2 progenies, the extent of
overlapping flowering time with G. soja will be reduced in that
generation. However, once secondary gene flow from the
F1 hybrid to G. soja has occurred, most of the backcross
progenies are expected to have flowering times relatively
similar to that of G. soja (Table 3). As the outcrossing rate
in wild soybean populations has been reported to be
9.3-19% (Fujita et al. 1997) and 0-6.3% (Kuroda et al. 2008),
our results suggest that G. max alleles can persist at some
frequency in wild populations as long as gene flow
continuously occurs at or near the maximum frequency.

Under the experimental field conditions, the total seed
number (PROD_1) of the F1 hybrid was similar to or less
than that of the corresponding G. soja parent (Table 3).
PROD_1 of most F2 progenies was less than that
of G. soja, although some F2 individuals showed a similar
or greater PROD_1 than the G. soja parent (Fig. 2A and
B). As the proportion of G. soja background increased
through backcrossing, the frequency of hybrid derivatives
that showed PROD_1 similar to that of G. soja also increased
(Fig. 2C-F). However, after one round of self-pollination
of the BC1F1 progenies, BC1F2 plants with short plant
height and low seed production, as was seen in the F2
progenies, appeared again (Fig. 2G and H).
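
The increase in G. soja background through backcrossing follows simple arithmetic, sketched below. This is the textbook genome-wide expectation without selection, so it ignores the marker-based selection of BC1F1 plants used in this study; the function name is hypothetical.

```python
from fractions import Fraction


def expected_soja_fraction(backcrosses):
    """Expected genome-wide proportion of G. soja background.

    The F1 carries half of each parental genome, and every backcross to
    G. soja halves the expected G. max share. Selfing does not change the
    expected proportion, so F2 = F1 and BC1F2 = BC1F1 on average.
    """
    return 1 - Fraction(1, 2) ** (backcrosses + 1)


for label, n in (("F1 / F2", 0), ("BC1F1 / BC1F2", 1), ("BC2F1", 2)):
    print(label, expected_soja_fraction(n))
# F1 / F2 1/2
# BC1F1 / BC1F2 3/4
# BC2F1 7/8
```

Although the expected background is the same in the BC1F1 and BC1F2, selfing generates G. max homozygotes at segregating loci, which is consistent with the reappearance of low-fitness plants noted above.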

Most G. max seeds died in the soil during the winter,
whereas the G. soja seeds survived (DORM_1, Table 3).
Although DORM_1 of the F1 hybrids was intermediate
between G. max and G. soja, F2 progenies showed wide
variation in DORM_1 (Fig. 2A and B). The extent of
DORM_1 of the F2 progenies was related to the seed
color (Table 4). As the proportion of G. soja background
was increased by backcrossing with G. soja, the seed
morphology (i.e., seed coat color and size) became closer
to that of G. soja, and DORM_1 of the backcross progenies
increased (Fig. 2C-F). However, after one round of
self-pollination of the BC1F1 progenies, BC1F2 seeds/plants
with low DORM_1 appeared (Fig. 2G and H). To understand
this further, the phenotypic variation observed in
the hybrid progenies was genetically dissected into QTLs
by constructing genetic linkage maps.

Seed dormancy-related QTLs 

Seedling emergence represents the interface between two
demographic events: seed production and seedling
recruitment. Because seed dormancy-related traits determine
the timing of seedling emergence, the physiology of
seed dormancy has a large effect on fitness. Good water
permeability is an important trait for uniform and rapid
germination in G. max cultivation and food processing.
Conversely, rapid water uptake is known to lead to cell
damage in the cotyledon (Powell and Matthews 1978)
and is disadvantageous to survival of G. soja during winter
in natural habitats. The physiological difference has
been characterized by many researchers who have measured
traits such as seed water imbibition or seed hardness
during several days under germinable conditions.
However, evaluation of seed dormancy is generally quite
different between artificial and natural conditions in
terms of time, water, and temperature conditions. Even
G. max seed, which imbibes water during winter, could
survive winter in 2006 (Table 3), indicating that water
imbibition does not always lead to loss of seed viability. In
this study, three major QTLs affecting both DORM_1 and
DORM_2, which are located on LG-A2, -C2, and -D1b
(Fig. 3A), were generally consistent over generations and
crossing combinations. A significant, high correlation
between DORM_1 and DORM_2 was observed (Appendix
A4): seeds from hybrid derivatives that had G. max alleles
at those QTLs imbibed water easily and appear to have
rotted in the soil over the winter. In particular, the G. max
allele for DORM_1 on LG-A2 was found to be partially
dominant to the G. soja allele because its effect appeared in
BC1F1 progenies and it had a large effect of reducing
survival rate in the W2 x D2 population (Appendix A5B).
Therefore, the effect of such strong G. max alleles may lead
to reduced winter survival of the seeds produced by an F1
hybrid plant as well as by later-generation progenies.

Nevertheless, the magnitudes of allele effects at the
three major DORM_1/DORM_2 QTLs were slightly
different depending on the cross combination. For example,
the effect of the QTL on LG-A2 was strongest among the
three QTLs in the W2 x D2 population, whereas it was
similar to that of the other two QTLs in the W1 x D1
population (Fig. 3A). This explains the different level of
seed winter survival between the W1 x D1 and
W2 x D2 combinations. All the previously reported
QTLs had the effect of causing water imbibition when the
alleles at those loci were from G. max. In a G. max x
G. soja population, Keim et al. (1990) detected four QTLs
on LG-A2, -L, and -D1b by evaluating imbibition of F4
seeds for 7 days at room temperature. In contrast,
Sakamoto et al. (2004) and Liu et al. (2007) identified two
QTLs, located on LG-C2 and -D1b, by evaluating imbibition
of seeds for 12 h and 24 h at room temperature,
respectively. Glycine gracilis is an intermediate form
between G. max and G. soja that originated in northeastern
China (Hymowitz 2004). Three QTLs (on LG-C2,
-D1b, and -I) were identified in a G. max x G. gracilis
population by testing imbibition of seeds for 24 h at
25°C (Watanabe et al. 2004). These results indicate that
QTLs on LG-C2 and -D1b are common among G. max
x G. soja populations, but that a QTL on LG-A2 is not
consistently detected in such populations. Similarly, in
this study, no QTL for seed hardness (DORM_2) was
detected on LG-A2 in the W1 x D1 population
(Fig. 3A). It is very interesting that QTLs for seed winter
survival (DORM_1), which required a long-term evaluation
in the field, were successfully identified in the
W1 x D1 combination in approximately the same region
on LG-A2 where QTLs for DORM_1 and DORM_2 were
detected in the W2 x D2 combination. One possible
explanation of this finding is that the effect of a QTL on
LG-A2 may appear when seeds imbibe water during
long-term evaluation if the seed coat of G. max has resistance
to water imbibition. The slow imbibition rate seen for the D1
parent also supports this explanation and suggests that
there is allelic variation within G. max for a seed hardness
QTL on LG-A2.

Based on the map locations of gene-derived markers
and the magnitude of QTL effects, the DORM_1/
DORM_2 QTLs on LG-A2 and LG-C2 are tightly linked
to the I locus and T locus, respectively, and the genes
responsible for DORM_1 are either I and T themselves or
genes closely linked to those loci (Fig. 3A). The I allele,
which suppresses seed coat pigmentation, is dominant to
the i allele, and the T allele, which confers pigmented
pubescence, is dominant to the t allele (Bernard and
Weiss 1973). Hybrid derivatives without black seed coat
(i.e., those with the I allele) showed low seed survival
(Table 4), and, thus, the I allele is related to the water
imbibition ability of G. max, which might be due to a
physical characteristic of the seed coat. Epistatic
interaction between the I and T loci has been reported to cause
seed coat cracking when the alleles at both the I and T loci
are recessive and homozygous (Lindstrom and Vodkin
1991). Such cracked F3 seeds produced from several F2
individuals imbibed water quickly and failed to survive
during winter (Table 4). Thus, epistatic interactions
account for the reduced fitness of progenies derived from
self-pollination, in spite of a low proportion of
double-recessive individuals in the progenies, through their
influence on seed viability or survival.
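
The "low proportion of double-recessive individuals" can be quantified with a small Mendelian sketch. It assumes, as an illustration only, that the selfed plant is heterozygous at both loci (Ii Tt) and that the loci assort independently, which is plausible because they lie on different linkage groups (LG-A2 and LG-C2); the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def selfed_locus(genotype):
    """Offspring genotype frequencies from selfing at one locus, e.g. 'Ii'."""
    freqs = {}
    for allele_a, allele_b in product(genotype, repeat=2):
        # Sort so that 'iI' and 'Ii' are counted as the same genotype.
        key = "".join(sorted(allele_a + allele_b))
        freqs[key] = freqs.get(key, Fraction(0)) + Fraction(1, 4)
    return freqs


def double_recessive_fraction(locus_1="Ii", locus_2="Tt"):
    """Expected ii tt frequency after selfing, assuming independent loci."""
    f1 = selfed_locus(locus_1)
    f2 = selfed_locus(locus_2)
    return f1.get("ii", Fraction(0)) * f2.get("tt", Fraction(0))


print(double_recessive_fraction())  # 1/16
```

Under these assumptions only one selfed progeny in 16 is expected to be double recessive, yet such seeds can still lower the mean fitness of selfed progenies if they rarely survive the winter.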

Seed production-related QTLs 

The genes for domestication-related traits, which
differentiate between crops and their wild relatives, are not
randomly distributed across crop genomes (Ross-Ibarra
2005; Kaga et al. 2008). In this study, QTLs with high
contributions to seed production-related traits,
representing distinct differences between G. soja and G. max,
tended to be concentrated in a particular genomic region
on LG-L (Fig. 3B). Those QTLs were common between
different cross combinations (W1 x D1 and W2 x D2)
as well as across different generations. One possible
reason for the positive, high correlation of total number
of seeds (PROD_1) with traits related to plant size such as
stem dry weight (PROD_5) and stem length (PROD_6;
Appendix A4) would be a gene related to stem elongation.
Classically, stem termination in soybean is known
to be controlled by two loci, Dt1 and Dt2 (Bernard
1972). The determinate stem type (dt1 allele) shows little
growth in stem length after flowering, whereas the
indeterminate stem type (Dt1 allele) continues to elongate
even after flowering. An intermediate phenotype, called
semideterminate, is conditioned by the Dt2 locus
(Bernard 1972). Because Dt1 and Dt2 have been mapped
on LG-L and LG-G, respectively (Cregan et al. 1999), the
QTL with a strong contribution to stem length
(PROD_6) on LG-L in this study is likely to be the Dt1
locus (Fig. 3B). Our results indicate that the G. max allele
at this locus has the effect of reducing the number of
seeds produced by hybrids between G. max and G. soja, as
previously reported by Wang et al. (2004). Intriguingly,
QTLs for seed weight (PROD_3) as well as other seed
production-related traits were closely linked to marker LFsoy3,
which was designed to detect a soybean homolog of
PsTFL1a, a gene controlling stem termination in Pisum
(Foucher et al. 2003). Further studies are necessary to
clarify the pleiotropic effect of soybean TFL1a on these traits.
The G. max allele at the QTL for PROD_6 on LG-L was
confirmed to have a moderate negative effect in the BC1F1
and BC2F1 populations, but, unlike in the progenies from
self-pollination, it had no effect on PROD_1 (Fig. 3B,
Appendix A5). A QTL for both PROD_6 and PROD_1 was
identified again on LG-L in the BC1F2 population. These
results indicate that the G. max allele is recessive to the
G. soja allele because its effects were detected only in progenies
generated by self-pollination.
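
The reasoning that a recessive allele is detected only after self-pollination can be sketched numerically. A backcross to G. soja yields only G. soja homozygotes (SS) and heterozygotes (SM), whereas selfing also yields G. max homozygotes (MM). The genotypic values below are hypothetical, and the function names are illustrative rather than part of any QTL software.

```python
def genotypic_values(midpoint, additive, dominance):
    """Genotypic values under the standard additive-dominance model.

    additive is the effect of the G. max allele (negative if it lowers
    the trait); dominance is the heterozygote deviation from the midpoint.
    """
    return {
        "SS": midpoint - additive,
        "SM": midpoint + dominance,
        "MM": midpoint + additive,
    }


def backcross_contrast(values):
    """Difference detectable in a backcross to G. soja (SM versus SS)."""
    return values["SM"] - values["SS"]


def selfed_additive_contrast(values):
    """Half the difference between homozygotes, detectable after selfing."""
    return (values["MM"] - values["SS"]) / 2


# Hypothetical trait: the G. max allele lowers the value by 20 units and is
# fully recessive, so the heterozygote equals the G. soja homozygote.
recessive = genotypic_values(midpoint=100, additive=-20, dominance=20)
print(backcross_contrast(recessive))        # 0
print(selfed_additive_contrast(recessive))  # -20.0
```

With a fully recessive G. max allele the backcross contrast is zero, so no QTL would be detected in BC1F1 or BC2F1 progenies, while the effect reappears among selfed progenies such as the F2 and BC1F2.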

Flowering phenology-related QTL 

Photosensitivity is also an important plant response that
is heavily involved in the control of flowering as well as
in successful seed production. There were clear differences
between the W1 x D1 and W2 x D2 populations
in terms of days to first flower (FLOW) (Table 3).
The W1 x D1 population, representing northern Japanese
germplasm, had shorter FLOW than the W2 x D2 population,
representing southern Japanese germplasm. This
difference reflects the adaptive strategy of G. soja and
G. max in Japan. In northern Japan, the growing season is
relatively short; thus, the W1 x D1 population might
respond to warm temperatures and start to produce seeds
during the short period of moderate climate even if the
plants are not large. In contrast, the W2 x D2 population
might respond to photoperiod and start to produce seeds
only after the plants have grown large because autumn is
relatively long in southern Japan.

Based on the location of SSR markers linked to
previously reported flowering loci, the FLOW QTLs on LG-O
(W1 x D1 population), -L (W1 x D1 and W2 x D2
populations), and -I (W1 x D1 population) found in this
study (Fig. 3B) are thought to be the classical maturity loci
E2 (Bernard 1971), E3 (Buzzell 1971), and E4 (Buzzell and
Voldeng 1980), respectively. The other FLOW QTLs with a large effect
(i.e., that on LG-H) or with a moderate effect (i.e., on LG-E
and -F in the W2 x D2 population [Appendix A5B] and
on LG-D1b and -K in the W1 x D1 population [Appendix
A5A]) have not been previously described and might be
new loci for flowering time in soybean. Although a QTL
for days to flowering on LG-C2 has been reported in a
G. max x G. gracilis population (Yamanaka et al. 2001;
Watanabe et al. 2004) and in a G. max x G. soja population
(Liu et al. 2007), no flowering time QTL at that location
was consistently identified in this study.

Evolutionary aspect of fitness-related QTLs 
and conclusions 

Natural selection is expected to occur on the pheno- 
types of individuals that constitute G. soja populations, 
including hybrids between G. soja and G. max. Moreover, 
the phenotype of the hybrid progenies is influenced by 
the genetic variability of both G. max and G. soja, in 
response to a heterogeneous environment such as the nat- 
ural habitat of G. soja. The results obtained here should 
be considered as an estimate obtained under conditions of 
maximum plant growth and seed production because the 
hybrid derivatives were widely spaced in the field (i.e., at 
intervals of 1 m); the results might have been different if 
the plants had been evaluated under conditions favoring 
high mortality of seedlings and restricted seed production 
in the competitive native weed population. 

Genotype-dependent phenotypic response to different 
environments is common to quantitative traits and is 
referred to as phenotypic plasticity (Bradshaw 1965). In 
particular, the genes for wide adaptability that might have 
accumulated during human selection of G. max are prob- 
ably different from those accumulated during ecological 
adaptation of G. soja, and they are likely to control more 
than the obvious morphological differences between the 
two species. For this reason, the effects of G. max genes 
were examined in this study in two types of hybrids 
between G. soja and G. max and were tested in two 
regions of Japan. 

A large number of genes and their interactions with
environmental changes during plant growth are thought to
influence seed production. Nevertheless, the only QTL
with a strong effect on PROD_1 between G. soja and
G. max across different regions was the one identified on
LG-L (Fig. 3B). The limited ability to detect QTLs
involved in complex epistatic interactions might have led
to underestimation of the number of loci involved in
PROD_1 because QTLs for traits such as PROD_4 and
FLOW that might be expected to affect PROD_1 were not
always detected as QTLs for PROD_1.

Until recently, little has been known about the effect of
G. max alleles within a predominantly G. soja genetic
background. In this study, the genetic effects of those
G. max alleles were not expressed as phenotypes in the BC1F1
and BC2F1 generations, indicating that most G. soja alleles
are dominant to G. max alleles; one notable exception was
the QTL for seed dormancy on LG-A2 (Fig. 3A). Snow
et al. (1999) indicated that after two or three generations of
backcrossing, hybrid derivatives in which crop alleles have
been introgressed can be just as competitive and successful
as wild plants. In this study, PROD_1 and DORM_1 in
the BC1F1 and BC2F1 generations approached the values for
G. soja as the proportion of G. soja genetic background
increased (Table 3). Although QTLs at which G. max
alleles had an increasing effect on PROD_1 and DORM_1
were not consistent over generations and crossing
combinations (Fig. 3, Appendix A5), these alleles may have the
potential to increase the fitness of hybrid derivatives.
Individual plants that had higher fitness than G. soja in terms
of SURV could be found in most generations of both the
W1 x D1 and W2 x D2 populations (Fig. 2).

In contrast, QTLs at which G. max alleles had negative
effects on fitness were consistently detected in both cross
combinations and in different generations. In particular,
QTLs for DORM_1 on LG-A2, -C2, and -D1b (Fig. 3A)
and for PROD_1 on LG-L (Fig. 3B) were found in both
cross combinations. This is one reason why hybrid
derivatives do not survive in natural habitats (Kaga et al. 2005;
Kuroda et al. 2005, 2006b, 2007), and why genetic
differentiation is maintained between G. soja and G. max
(Maughan et al. 1996; Powell et al. 1996; Xu and Gai 2003;
Kuroda et al. 2006a). Previously, it was reported that
hybrids between wild and crop species should be less fit
than their wild parents due to the burden that crop traits
would introduce into wild plants (De Wet and Harlan
1975). Current knowledge of the genetic basis of
domestication traits suggests that few genomic regions are usually
involved in domestication (White and Doebley 1998;
Gross and Olsen 2010); thus, these regions could be
purged quite rapidly with no long-term impact on fitness
within the first few generations after hybridization. Our
results support these studies and suggest that the risk of
transgene dispersal into the wild soybean gene pool is
generally low in Japan. Simulation studies of the extent
to which G. max alleles persist under a mixed mating system
(i.e., considering the relative proportions of progenies both
from self-fertilization and from outcrossing events) are
required to improve the assessment of environmental
transgene dispersal from GM soybeans.